33 research outputs found
Quantifying the Impact of Non-Stationarity in Reinforcement Learning-Based Traffic Signal Control
In reinforcement learning (RL), dealing with non-stationarity is a
challenging issue. However, some domains such as traffic optimization are
inherently non-stationary. Causes for and effects of this are manifold. In
particular, when dealing with traffic signal controls, addressing
non-stationarity is key since traffic conditions change over time and as a
function of traffic control decisions taken in other parts of a network. In
this paper we analyze the effects that different sources of non-stationarity
have in a network of traffic signals, in which each signal is modeled as a
learning agent. More precisely, we study both the effects of changing the
context in which an agent learns (e.g., a change in flow rates
experienced by it), as well as the effects of reducing agent observability of
the true environment state. Partial observability may cause distinct states (in
which distinct actions are optimal) to be seen as the same by the traffic
signal agents. This, in turn, may lead to sub-optimal performance. We show that
the lack of suitable sensors to provide a representative observation of the
real state seems to affect the performance more drastically than the changes to
the underlying traffic patterns. Comment: 13 page
Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization
Multi-objective reinforcement learning (MORL) algorithms tackle sequential
decision problems where agents may have different preferences over (possibly
conflicting) reward functions. Such algorithms often learn a set of policies
(each optimized for a particular agent preference) that can later be used to
solve problems with novel preferences. We introduce a novel algorithm that uses
Generalized Policy Improvement (GPI) to define principled, formally-derived
prioritization schemes that improve sample-efficient learning. They implement
active-learning strategies by which the agent can (i) identify the most
promising preferences/objectives to train on at each moment, to more rapidly
solve a given MORL problem; and (ii) identify which previous experiences are
most relevant when learning a policy for a particular agent preference, via a
novel Dyna-style MORL method. We prove our algorithm is guaranteed to always
converge to an optimal solution in a finite number of steps, or an
ε-optimal solution (for a bounded ε) if the agent is limited
and can only identify possibly sub-optimal policies. We also prove that our
method monotonically improves the quality of its partial solutions while
learning. Finally, we introduce a bound that characterizes the maximum utility
loss (with respect to the optimal solution) incurred by the partial solutions
computed by our method throughout learning. We empirically show that our method
outperforms state-of-the-art MORL algorithms in challenging multi-objective
tasks, both with discrete and continuous state and action spaces. Comment: Accepted to AAMAS 202
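As a rough illustration of the Generalized Policy Improvement step this abstract builds on, the sketch below picks, for a given preference vector, the action whose scalarized value is highest across a set of previously learned policies. It is not the authors' algorithm (the prioritization schemes and the Dyna-style component are omitted), and the names `gpi_action`, `scalarize`, and the toy Q-values are assumptions made up for this example.

```python
from typing import Dict, List, Sequence

# Hypothetical toy data: for one fixed state, q_values[i][action] is the
# vector-valued Q estimate (one entry per objective) of policy i.
QTable = List[Dict[str, Sequence[float]]]


def scalarize(q_vec: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: dot product of a Q-vector and a preference vector."""
    return sum(q * w for q, w in zip(q_vec, weights))


def gpi_action(q_values: QTable, weights: Sequence[float]) -> str:
    """Return argmax_a max_i  w . Q_i(s, a) over all stored policies i."""
    actions = q_values[0].keys()
    return max(
        actions,
        key=lambda a: max(scalarize(q[a], weights) for q in q_values),
    )


q_values: QTable = [
    {"left": (1.0, 0.0), "right": (0.2, 0.1)},  # policy tuned for objective 0
    {"left": (0.0, 0.3), "right": (0.1, 1.0)},  # policy tuned for objective 1
]

# A preference never trained on directly can still be served by GPI.
chosen = gpi_action(q_values, (0.5, 0.5))
```

The point of the inner `max` is that the agent can borrow the best estimate from any stored policy, which is what lets a set of policies be reused for novel preferences.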
Analysis of single nucleotide polymorphisms in the FAS and CTLA-4 genes of peripheral T-cell lymphomas
Angioimmunoblastic T-cell lymphoma (AILT) represents a subset of T-cell lymphomas but resembles an autoimmune disease in many of its clinical aspects. Despite the phenotype of effector T-cells and high expression of FAS and CTLA-4 receptor molecules, tumor cells fail to undergo apoptosis. We investigated single nucleotide polymorphisms (SNPs) of the FAS and CTLA-4 genes in 94 peripheral T-cell lymphomas. Although allelic frequencies of some FAS SNPs were enriched in AILT cases, none of these occurred at a different frequency compared to healthy individuals. Therefore, SNPs in these genes are not associated with the apoptotic defect and autoimmune phenomena in AILT
Parameterized Melody Generation with Autoencoders and Temporally-Consistent Noise
We introduce a machine learning technique to autonomously generate novel melodies that are variations of an arbitrary base melody. These are produced by a neural network that ensures that (with high probability) the melodic and rhythmic structure of the new melody is consistent with a given set of sample songs. We train a Variational Autoencoder network to identify a low-dimensional set of variables that allows for the compression and representation of sample songs. By perturbing these variables with Perlin Noise, a temporally-consistent parameterized noise function, it is possible to generate smoothly-changing novel melodies. We show that (1) by regulating the amount of noise, one can specify how much of the base song will be preserved; and (2) there is a direct correlation between the noise signal and the differences between the statistical properties of novel melodies and the original one. Users can interpret the controllable noise as a type of "creativity knob": the higher it is, the more leeway the network has to generate significantly different melodies. We present a physical prototype that allows musicians to use a keyboard to provide base melodies and to adjust the network's "creativity knobs" to regulate in real-time the process that proposes new melody ideas
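The idea of perturbing latent variables with temporally-consistent noise can be sketched as follows. This is a minimal illustration using a one-dimensional Perlin-style gradient noise, not the authors' implementation: the Variational Autoencoder itself is left out, and `make_perlin_1d`, `perturb_latents`, the `amount` parameter (standing in for the "creativity knob"), and the example latent vector are all hypothetical.

```python
import math
import random
from typing import Callable, List, Sequence


def fade(t: float) -> float:
    """Perlin's smoothstep-like fade curve 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (6 * t - 15) + 10)


def make_perlin_1d(seed: int, n_gradients: int = 256) -> Callable[[float], float]:
    """Build a 1-D gradient-noise function with random slopes at integer points."""
    rng = random.Random(seed)
    grads = [rng.uniform(-1.0, 1.0) for _ in range(n_gradients)]

    def noise(x: float) -> float:
        i0 = math.floor(x)
        t = x - i0
        n0 = grads[i0 % n_gradients] * t                  # left lattice point
        n1 = grads[(i0 + 1) % n_gradients] * (t - 1.0)    # right lattice point
        return n0 + fade(t) * (n1 - n0)                   # smooth interpolation

    return noise


def perturb_latents(base: Sequence[float], amount: float, steps: int,
                    step_size: float = 0.1, seed: int = 0) -> List[List[float]]:
    """Return one perturbed latent vector per time step.

    Each latent dimension gets its own noise function; `amount` scales how
    far the trajectory may drift from the base latent vector.
    """
    noises = [make_perlin_1d(seed + k) for k in range(len(base))]
    return [
        [z + amount * n(s * step_size) for z, n in zip(base, noises)]
        for s in range(steps)
    ]


base_latent = [0.5, -1.0, 0.25]  # hypothetical encoding of a base melody
trajectory = perturb_latents(base_latent, amount=0.8, steps=50)
```

In a full pipeline each vector in `trajectory` would be passed to the decoder to obtain a melody; because neighbouring vectors differ only slightly, the decoded melodies would be expected to change smoothly, and setting `amount` to zero reproduces the base latent vector unchanged.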
Poster: Rosario "Ciudad Candia"
The general objective is to give visibility to the company's output, which makes it possible to trace a cross-sectional path through local architectural development, threading together historical periods, designers, and construction techniques. It is notable that in the historiography of architecture the designer is the one who gets cited, relegating to the background the builders who contributed their empirical and practical knowledge to the construction of the city. Affiliation: Secretaría de Ciencia y Tecnología - Universidad Nacional de Rosario. Facultad de Arquitectura, Planeamiento y Diseño; Argentina
CARMA1 is a critical lipid raft-associated regulator of TCR-induced NF-kappa B activation.
CARMA1 is a lymphocyte-specific member of the membrane-associated guanylate kinase (MAGUK) family of scaffolding proteins, which coordinate signaling pathways emanating from the plasma membrane. CARMA1 interacts with Bcl10 via its caspase-recruitment domain (CARD). Here we investigated the role of CARMA1 in T cell activation and found that T cell receptor (TCR) stimulation induced a physical association of CARMA1 with the TCR and Bcl10. We found that CARMA1 was constitutively associated with lipid rafts, whereas cytoplasmic Bcl10 translocated into lipid rafts upon TCR engagement. A CARMA1 mutant, defective for Bcl10 binding, had a dominant-negative (DN) effect on TCR-induced NF-kappa B activation and IL-2 production and on the c-Jun NH(2)-terminal kinase (Jnk) pathway when the TCR was coengaged with CD28. Together, our data show that CARMA1 is a critical lipid raft-associated regulator of TCR-induced NF-kappa B activation and CD28 costimulation-dependent Jnk activation